Assessing treatment efficacy for interval-censored endpoints using multistate semi-Markov models fit to multiple data streams

Jon Fintzi

Statistical Methodology and Innovation, Bristol Myers Squibb

August 4, 2025

REGEN-2069 trial of mAb as COVID-19 prophylaxis

Monoclonal antibody (mAb), REGEN-COV, for prevention of COVID-19.

  • Primary endpoint was symptomatic infection within 28 days.

  • SARS-CoV-2 naïve unvaccinated participants enrolled within 96 hours of household index case.

  • Randomized 1:1 to mAb vs. placebo.

  • Monitored continuously for symptoms. Also, weekly nasopharyngeal swabs for RT-qPCR and serology for anti-nucleocapsid antibodies at 28 days.

    • PCR indicates ongoing viral shedding.
    • Positive serology is a marker of immune response to infection.
  • 81.4% reduction in risk of symptomatic infection (odds ratio, 0.17; 95% CI, 0.09 to 0.33; p < 0.001).

Goals in secondary analysis:

  • Protective efficacy (PE) against infection (symptomatic + asymptomatic).
  • Cumulative incidence of infection over the 28-day study period.
  • Seroconversion following infection.
  • Duration of detectable viral shedding.

Difficulty: infection is not continuously observed.

Note: From now on, infection = participant is measurably affected by infection.

Data assimilation

Strategy: combine PCR, symptom, and serology data.

  • No one data stream completely captures infection.
  • More kicks at the can to detect infections.
  • Scientific collaborators use their clinical and immunological expertise to help us formulate a model.

Modeling multiple data streams (big picture)

Study participants transition through discrete states of infection and immune response:

  • 1 = Infection naïve,
  • 2 = PCR+, no history of symptoms,
  • 3 = No longer PCR+, no history of symptoms,
  • 4 = PCR+ with history of symptoms,
  • 5 = No longer PCR+ with history of symptoms.

Multistate model for infection and immunological progression. Arrows are model transitions, dashed lines represent anti-nucleocapsid positivity at D28.

Idea: PCR, symptoms, and serology are breadcrumbs about each participant’s trajectory.

Hypothetical data for three study participants. PCR is assessed weekly, symptoms are monitored continuously, and serology is assessed at day 28. The derived information is used to infer the possible model states.
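As a rough illustration of how a single observation narrows down the possible states, the sketch below maps a PCR result and a symptom history to the set of compatible states, using the state numbering above. It is a simplification made here for illustration: it assumes the PCR result is error-free at the swab time, ignores the day 28 serology, and the function name `compatible_states` is hypothetical rather than taken from the analysis.

```python
def compatible_states(pcr_positive, symptom_history):
    """Return the set of model states compatible with one observation.

    Assumes (for illustration only) an error-free PCR result and a known
    history of symptomatic infection; serology is ignored.
    """
    if symptom_history:
        # States 4 and 5 both carry a history of symptoms.
        return {4} if pcr_positive else {5}
    # Without symptoms, a negative PCR cannot distinguish an infection-naive
    # participant (1) from one who has already stopped shedding (3).
    return {2} if pcr_positive else {1, 3}


print(compatible_states(pcr_positive=False, symptom_history=False))
```

The PCR-negative, symptom-free case returns two candidate states, which is the incomplete identification that the model has to average over.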

Model formulation (big picture)

A multistate model is characterized by

  • a set of distinct states, and
  • transition intensities (like hazards for survival) from one state to another, \[ \lambda_{ij}(t\mid H(t^-)) = \lim_{\Delta t \downarrow 0}\frac{\Pr\left(\text{State at }t^-+\Delta t = j\ \vert\ \text{State at }t^- = i,\ \text{history up to }t^-\right)}{\Delta t} \] where \(t\) is the time since entry to the current state, \(i\), and \(j \neq i\) is the destination state.
  • Convenient to adopt a proportional transition intensity parameterization for treatment effects.
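One common way to write a proportional transition intensity parameterization is sketched below. The notation is assumed here rather than taken from the slides: \(\lambda^{0}_{ij}\) is a baseline intensity, \(Z\) is the treatment indicator (1 = mAb, 0 = placebo), and \(\beta_{ij}\) is a transition-specific log intensity ratio.

```latex
\[
\lambda_{ij}(t \mid Z) = \lambda^{0}_{ij}(t)\,\exp\left(\beta_{ij} Z\right),
\qquad
\frac{\lambda_{ij}(t \mid Z = 1)}{\lambda_{ij}(t \mid Z = 0)} = \exp\left(\beta_{ij}\right).
\]
```

Under this form the treatment effect on each transition is a single multiplicative factor that does not depend on \(t\).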

Temporal dynamics of clinical progression:

  • Markov models:

    • Transition intensities independent of state sojourn.
    • Can be used to analyze data under intermittent observation and hidden states.
    • Often unrealistic, commonly used for computational reasons.
  • Semi-Markov models:

    • Transition intensities depend on duration of state occupancy.
    • More realistic but difficult to fit.
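The difference between the two model classes can be seen directly in the intensity functions. The sketch below contrasts a constant (exponential) intensity with a Weibull intensity in a shape/scale parameterization; the function names and parameter values are illustrative assumptions, not the parameterization used in the analysis.

```python
def exponential_intensity(t, rate):
    """Markov case: the intensity ignores the time t since state entry."""
    return rate


def weibull_intensity(t, shape, scale):
    """Semi-Markov case: the intensity depends on time t > 0 since state entry.

    shape > 1 gives an increasing intensity, shape < 1 a decreasing one,
    and shape == 1 reduces to the exponential case with rate 1 / scale.
    """
    return (shape / scale) * (t / scale) ** (shape - 1)


for t in (1.0, 5.0, 10.0):
    print(t, exponential_intensity(t, rate=0.2), weibull_intensity(t, shape=2.0, scale=5.0))
```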

Here’s the problem

Data are a coarse reflection of a latent biological process in continuous time.

  • Coarse data with different temporal resolutions and complicated censoring.
  • Incomplete identification of state labels at observation times.
  • Likelihood is a product of transition probabilities over inter-observation intervals.

Marginal likelihoods for semi-Markov processes integrate over the number + timing of unobserved state transitions.

  • Tractable for simple progressive processes, very difficult to evaluate in general.
  • No Kolmogorov forward equation as in the Markov case.
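For contrast, here is a minimal sketch of why the Markov case is tractable: the transition probability matrix over an interval of length \(\Delta\) is the matrix exponential \(\exp(Q\Delta)\) of the intensity matrix \(Q\), so the panel-data likelihood is a product of its entries. The sketch assumes states are observed exactly at each visit (which is not true of the trial data), uses 0-based state indices, and the helper names are hypothetical.

```python
import math


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_exp(Q, t, terms=30, squarings=10):
    """exp(Q * t) via a Taylor series with scaling and squaring."""
    n = len(Q)
    scale = t / (2 ** squarings)
    M = [[Q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, M)]  # M^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def panel_log_likelihood(Q, observations):
    """Log-likelihood of one participant's (time, state) pairs, sorted by time."""
    ll = 0.0
    for (t0, s0), (t1, s1) in zip(observations, observations[1:]):
        P = mat_exp(Q, t1 - t0)
        ll += math.log(P[s0][s1])
    return ll


# Two-state toy chain observed weekly; rates are arbitrary illustrative values.
Q = [[-0.3, 0.3], [0.1, -0.1]]
print(panel_log_likelihood(Q, [(0.0, 0), (7.0, 0), (14.0, 1)]))
```

No comparable closed form exists once the intensities depend on the time since state entry, which is what motivates the simulation-based approach.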

Methodological contribution

Recall, in the expectation-maximization (EM) algorithm we alternate between:

  • E-step: calculate the expected complete data log-likelihood, i.e., Q-function, to average over missing data.

  • M-step: maximize the Q-function.

  • Wash, rinse, repeat until convergence to obtain an MLE.

Key innovation: Monte Carlo expectation-maximization framework for fitting multistate semi-Markov models to coarsened data.

  • E-step: approximate Q-function via Monte Carlo – average log-likelihood over paths that are sampled conditionally on the data.

  • M-step: maximize the Q-function.

  • Wash, rinse, repeat until convergence to obtain an MLE.

  • But how should we sample the paths?

Our proposal:

  • Sample latent paths using a Markov surrogate + standard algorithms for HMMs and endpoint conditioned Markov chains.

  • Model-agnostic algorithm: can accommodate complex model structure, semi-parametric intensities, and coarse data.
  • Details in the manuscript, available on arXiv: https://arxiv.org/abs/2501.14097.
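To make the E-step/M-step alternation concrete, here is a toy Monte Carlo EM that is much simpler than the method in the manuscript. It estimates a single exponential rate from interval-censored event times, and it samples the missing event times from their exact conditional distribution, so no Markov surrogate is needed. All names, the simulated data, and the tuning values are assumptions made for illustration.

```python
import math
import random


def sample_truncated_exponential(rate, left, right, rng):
    """Draw T ~ Exponential(rate) conditioned on lying in the interval [left, right)."""
    s_left = math.exp(-rate * left)    # survival function at the left endpoint
    s_right = math.exp(-rate * right)  # survival function at the right endpoint
    u = rng.random()
    return -math.log(s_left - u * (s_left - s_right)) / rate


def mcem_exponential_rate(intervals, rate_init=1.0, n_iter=30, n_draws=10, seed=1):
    rng = random.Random(seed)
    rate = rate_init
    n = len(intervals)
    for _iteration in range(n_iter):
        # E-step (Monte Carlo): impute event times given the data and current rate.
        total_time = 0.0
        for left, right in intervals:
            for _draw in range(n_draws):
                total_time += sample_truncated_exponential(rate, left, right, rng)
        # M-step: the Monte Carlo Q-function averages log(rate) - rate * T over
        # the draws, and is maximized in closed form.
        rate = n * n_draws / total_time
    return rate


# Simulated data: event times are only known up to the week in which they occur.
data_rng = random.Random(0)
true_rate = 0.2
event_times = [data_rng.expovariate(true_rate) for _ in range(1000)]
intervals = [(7 * math.floor(t / 7), 7 * math.floor(t / 7) + 7) for t in event_times]
estimate = mcem_exponential_rate(intervals)
print(estimate)
```

In the semi-Markov setting the missing data are whole paths rather than single event times, and exact conditional sampling is no longer available, which is why the choice of path sampler matters.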

Models

Fit 6 models varying in their flexibility.

  • Markov model: exponential transition intensities as fit by the msm package in R (only off-the-shelf game in town for panel data with state misclassification).

  • Semi-Markov models:

    • Weibull transition intensities (fully parametric).
    • B-splines (flexible) for infection (1-2) + symptom onset (2-4).

Parameterizations of transition intensities with time, \(t\), being time since state entry.

Results

We choose the second most flexible model by AIC.

  • Gain 100 units of log-likelihood over time-homogeneous Markov model.
  • Gain 78.7 units of log-likelihood over parametric Weibull intensity model.
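The standard AIC definition makes these log-likelihood gains easy to interpret. The notation is assumed here, not taken from the slides: \(k\) is the number of free parameters and \(\hat{\ell}\) is the maximized log-likelihood.

```latex
\[
\mathrm{AIC} = 2k - 2\hat{\ell},
\qquad
\mathrm{AIC}_B - \mathrm{AIC}_A = 2\,(k_B - k_A) - 2\,(\hat{\ell}_B - \hat{\ell}_A).
\]
```

Model B is therefore preferred over model A whenever its log-likelihood gain exceeds its number of additional parameters. For example, a gain of 78.7 units favors the more flexible model unless it adds 79 or more parameters.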

Model comparison. The degree and interior knots of the B-Spline intensities are \(\delta\) and \(\xi\), respectively.

Results

Cumulative incidence of symptomatic infection (left) and all-comer infection (right).

  • Good fit for incidence of symptomatic infection (continuously observed).
  • Large gap in week 1 between observed infections and smooth estimate reflects early infections not caught until first PCR or symptom onset.
  • Persistent discrepancy is short shedders not captured by weekly PCR.

Key takeaways from the analysis:

Multifaceted benefit of mAb prophylaxis:

  • Reduced the risk of infection (PE = 1 - relative risk):

    • Overall infection: PE = 60.4% (95% CI: 44.9%, 72.5%),
    • Symptomatic: PE = 83.6% (95% CI: 69.4%, 93.1%),
    • Asymptomatic: PE = 38.7% (95% CI: 10.0%, 60.8%).
  • Reduced the risk of symptoms following infection.

    • RR = 41.2% (95% CI: 18.9%, 67.7%).
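The PE values in the first bullet convert to relative risks through PE = 1 - RR. A short sketch of the arithmetic follows; the function names are hypothetical, and the interval endpoints swap because the transformation is decreasing.

```python
def pe_to_rr(pe):
    """Relative risk implied by a protective efficacy, both as proportions."""
    return 1.0 - pe


def pe_ci_to_rr_ci(pe_lower, pe_upper):
    """Map a PE interval to the RR scale; the endpoints swap roles."""
    return (1.0 - pe_upper, 1.0 - pe_lower)


# Overall infection: PE = 60.4% (95% CI: 44.9%, 72.5%).
rr = pe_to_rr(0.604)
rr_ci = pe_ci_to_rr_ci(0.449, 0.725)
print(round(rr, 3), tuple(round(x, 3) for x in rr_ci))
```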

Lower rate of seroconversion following infection:

  • RR of seroconversion = 31.9% (95% CI: 22.3%, 44.6%).
  • Consistent with less intense immune response to lower viral loads.

Shortened duration of detectable viral shedding:

  • mAb: 6.2 days (95% CI: 5.0 days, 7.8 days),
  • Placebo: 13.0 days (95% CI: 11.5 days, 14.6 days).

Wrapping up

Contributions on methodology and implementation

  • Fit semi-Markov models to data with complex coarsening patterns.
  • Algorithm is agnostic to model structure. It is not a panacea for non-identifiability, but in that case it fails loudly (a strength).
  • Can accommodate splines + other flexible functions.
  • Flexible and general implementation in R/Julia.

Methodological extensions (“low-hanging fruit”)

  • Disease driven observation schemes + preferential sampling.
  • Penalized splines and approximate cross-validation for semi-parametric inference with automatic smoothing.
  • Fast robust uncertainty quantification.
  • Phase-type proposal distributions.

Please reach out if you want to collaborate on an analysis or methods! 😊 (jonathan.fintzi@bms.com)

Thank you!

And also thanks to my excellent collaborators!

  • Raphaël Morsomme (FDA),
  • C. Jason Liang, Allyson Mateja, Dean Follmann (NIAID, NIH),
  • CG Wang, Meagan O’Brien (Regeneron).

Supplementary slides

Crude tabulation of participant outcomes

Additional results

Additional results

Comparison with phase-type models

Setup:

  • Recurrent illness-death model, monthly observations over 1 year, death observed exactly. N = 1000.
  • Healthy -> Ill is Weibull with increasing intensity, other transitions exponential.
  • Models: time-homogeneous Markov, phase-type with 2 latent states for healthy -> ill transition, semi-Markov with spline for healthy -> ill transition (degree 1, interior knot at 0.5 chosen arbitrarily).

Efficiency vs. rejection sampling

Compute effort to obtain complete trajectories for a single transition. A&B = rejection sampler from Aralis & Brookmeyer (2019).